Spatial data visualisation

Author

Tatiana Proboste

What we will learn

In order to explore spatial data and understand it better, we will focus on basic analysis, spatial statistics and mapping. The goal of today’s visualisation session is to equip you with the necessary tools to: ’

• Access open data

• Perform transformations, CRS, extraction, and calculation on it

• Carry out spatial data wrangling.

Organize our data

Before we start working in R, let’s create a new folder with the name of our project (spatial visualisation) and within this folder, we will create 3 new folders:

  • raw_data
  • outputs
  • scripts

Load libraries

The sf package (Pebesma, E., 2022) stands for Simple Features.

Another important package that we want to explore: tmap is a package to plot our maps

Packages needed for the analysis

library(readr) 
library(tidyverse) # cleaning, wrangling 
library(sf) # spatial manipulation
library(tmap) # map visualisations
library(leaflet) # leaflet maps 
library(ozmaps) # for maps from Australia 
library(terra) # spatial data analysis with raster  
library(geodata)
#install.packages("remotes") 
#remotes::install_github("wfmackey/absmapsdata")

Introduction

In this session, we will explore the basic function and package required to visualize geospatial data using information available from Farm Transparency (https://www.farmtransparency.org/). The data available provides details about the locations of different types of farms across Australia. Our objective is to create a map that shows the location of some specific farms in Australia and in Queensland, and we will calculate how many farms we have per SA2 and per 100,000 people. The data is available in an Excel file with latitude and longitude information. First, we need to access the data and filter out unwanted information. Once this is done, we can map the points where the farms are located using a shape file with Australian boundaries. To calculate the density of farms per 100,000 people, we need census data. For this exercise, we will use the 2016 census data as a package available to download this information. We will calculate the number of farms in each postcode and divide it by the population. Finally, we will brifly investigate if the location of farms in relation with the minimum temperatures.

1. Accessing open data

Ozmap is a package that give us a shape file for Australian boundaries.

library(ozmaps)
aus <- ozmaps::ozmap('abs_ced') # Map of Australia  
library(Census2016)
census <- Census2016::Census2016_wide_by_SA2_year
census <- census %>% filter(year == 2016)
library(absmapsdata)
absmapsdata::absmapsdata_file_list
aus2016 <- absmapsdata::sa22016

2. Load farm transparency data

The data frame contains information about different production systems across Australia. So now, we need to filter only the type of farms of our interest within QLD. But first, we clean the name of the variables using the function clean_names from janitor package.

ft <- janitor::clean_names(ft) 

3. Explore the data - visualisation

First, we plot our data in the space without converting it to a spatial data frame. We will also colour by type of farm.

Point: using ggplot

ggplot(ft, aes(x=lng, y=lat, color= type))+ 
  geom_point()+  
  theme_bw() 

We can also use interactive functions from plotly. This package allows us to check the information for each point.

plotly::ggplotly(
  ggplot(ft, aes(x=lng, y=lat, color= type))+ 
  geom_point()+  
  theme_bw() 
) 

Point: using tmap

If we try to use the out farm data, we can see that it is a tibble if we used str(ft), so if we use tmap, this is not going to work

tmap::tm_shape(ft)+
  tm_dots()
Error: Object ft is neither from class sf, stars, Spatial, Raster, nor SpatRaster.

We need to transform our data frame to a spatial data frame using the function st_as_sf().

ft_s <-  sf::st_as_sf(ft,coords = c('lng', 'lat'), crs = 4326) # WGS 84

Before we start mapping, we need to check the Coordinate Reference System (CRS) of the spatial data frame. Coordinate Reference System (CRS) define how the spatial elements of the data relate to the surface of the Earth.

To change the geometries we will use st_tranform()

tmap_mode('view')  # use 'plot' to turn off the interactive view
tmap mode set to interactive viewing
tm_shape(ft_s)+  
  tm_dots(col = 'type', size = 0.1, 
          title = "farms in Australia", 
          palette = "RdBu" )
Warning: Number of levels of the variable "type" is 39, which is larger than
max.categories (which is 30), so levels are combined. Set
tmap_options(max.categories = 39) in the layer function to show all levels.
tmap_mode('plot')  # turn off the interactive view of tmap
tmap mode set to plotting

Now, we will focus only on QLD

qld <- aus2016 %>%
  filter(state_name_2016=="Queensland")

We do not need all of those variables, so we will keep only SA2_code_201

qld <- qld %>% 
  dplyr::select(sa2_code_2016)

plot(qld)

This will also give as an error due to empty units.

tm_shape(qld)+
  tm_polygons()
Warning: The shape qld contains empty units.
Error in `$<-.data.frame`(`*tmp*`, "SHAPE_AREAS", value = c(13.6556008964765, : replacement has 528 rows, data has 530

To correct this, we can use st_is_empty()

Then we can remove all the empty units from our shape file

qld <- qld %>% filter (!st_is_empty(.))
tm_shape(qld)+
  tm_polygons()

Make a more map for QLD and with scale and compass

tmap_style('white') #natural, cobalt
tmap style set to "white"
other available styles are: "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "classic", "watercolor" 
tm_shape(qld) + 
  tm_polygons(col = 'grey10', alpha = 0.1)+
  tm_shape(ft_s)+ 
  tm_dots(col = 'type',      
          size = 0.1,  
          title = "Farms in QLD",        
          palette = "RdBu") +  
  tm_layout(legend.position = c('right', 'top'),   
            legend.outside = TRUE)+ 
  tm_compass(position = c('right', 'top'))+ 
  tm_grid(lines = F)+   
  tm_scale_bar(text.size = 1, position = c('left', 'bottom'))
Warning: Number of levels of the variable "type" is 39, which is larger than
max.categories (which is 30), so levels are combined. Set
tmap_options(max.categories = 39) in the layer function to show all levels.
Some legend labels were too wide. These labels have been resized to 0.46, 0.53, 0.54, 0.58, 0.58, 0.47. Increase legend.width (argument of tm_layout) to make the legend wider and therefore the labels larger.

4. Data calculation

Now lets focus on only farms in QLD and calculate how many famrs we have per 100,000 people

we need the number of farms per SA2

This code will give us an error because farms and census_qld have different CRS

farms_sa2 <- farms %>% st_join(census_qld)
Error in st_geos_binop("intersects", x, y, sparse = sparse, prepared = prepared, : st_crs(x) == st_crs(y) is not TRUE

change the CRS

join the information from the farms with the population per SA2 (based on census 2016)

add the new column of farms ‘incidence’

more maps

tm_shape(census_qld)+
  tm_polygons(col = 'farms')+
  tm_shape(farms_sa2)+
  tm_dots()

tm_shape(census_qld) + 
  tm_polygons() +
  tm_bubbles(size = "farms")

tmap_style("natural")
tmap style set to "natural"
other available styles are: "white", "gray", "cobalt", "col_blind", "albatross", "beaver", "bw", "classic", "watercolor" 
tm_shape(census_qld) +
  tm_polygons("farms", title = "Farms", style = "cont") +
  tm_layout(legend.outside = TRUE) +
  tm_scale_bar(position = c("left", "bottom")) + # add scale bar to the top right
  tm_compass(type = "arrow",
             position = c("right", "top"))

Bonus map (author: Dr Caitie Kuempel)

#Leaflet
library(shiny)
library(leaflet)

# color palette 
pal <- 
  colorBin(
    palette = "YlOrRd",
    domain = census_qld$farms) #change to your data here

# pop up message
labels <- 
  sprintf(
    "<strong>%s</strong><br/>%g",
  census_qld$sa2_name,  census_qld$farms) %>% #change to your data here
  lapply(htmltools::HTML)

shinyApp(
  ui <- navbarPage("Leaflet", id="nav", 
                   # a tab for the map 
                   tabPanel(
                     "Interactive map",
                     shinycssloaders::withSpinner(leafletOutput(
                       outputId = "mymap", 
                       width = "900px", 
                       height = "500px"))),
                   # A tab to explore the data in table format
                   tabPanel("Explore the data",
                            DT::dataTableOutput("table"))
  ),
  
  server <- function(input, output) {
    
    # map panel 
    output$mymap <- renderLeaflet({
      
      # passing the shp df to leaflet
      leaflet(census_qld) %>% #change to your data here
        # zooming in on Brisbane
        setView(153.0260, -27.4705, 8) %>% # long/lat
        # adding tiles, without labels to minimize clutter
        addProviderTiles("CartoDB.PositronNoLabels") %>%
        # parameters for the polygons
        addPolygons(
          fillColor = ~pal(farms), 
          weight = 1,
          opacity = 1,
          color = "white",
          fillOpacity = 0.7,
          highlight = highlightOptions(
            weight = 2,
            color = "#666",
            fillOpacity = 0.7,
            bringToFront = TRUE),
          label = labels,
          labelOptions = labelOptions(
            style = list("font-weight" = "normal"),
            textsize = "15px",
            direction = "auto")) %>%
        # legend
        addLegend(pal = pal,
                  values = census_qld$farms, #change to your data here
                  position = "bottomright",
                  title = "Farms", #change to your data here
                  opacity = 0.8,
                  na.label = "No data")
    })
    
    # data panel
    output$table <- DT::renderDataTable({
      DT::datatable(pop_shp %>% st_drop_geometry(), rownames = F,  filter = 'top', #change to your data here
                    extensions = c('Buttons', 'FixedHeader', 'Scroller'),
                    options = list(pageLength = 15, lengthChange = F,
                                   fixedHeader = TRUE,
                                   dom = 'lfBrtip',
                                   list('copy', 'print', list(
                                     extend = 'collection',
                                     buttons = c('csv', 'excel', 'pdf'),
                                     text = 'Download'
                                   ))
                    ))
    })
    
  }
  
  ,
  
  options = list(height = 700)
)

5. Raster

The pakage geodata contains information from WorldClim

For raster, we will use terra package, which is similar than the old raster package, but is more adaptable and faster than the old package and is made by the same people.

here we will crop the raster of temperature from Australia to only the are of QLD

tmin_qld <- terra::crop(australia, qld)

tm_shape(tmin_qld)+
  tm_raster()
stars object downsampled to 883 by 1133 cells. See tm_shape manual (argument raster.downsample)

points <- ft %>% select(lng, lat,type, state) %>% 
  filter(state =='QLD' & 
           type == c('Pigs', 'Dairy', 'Broiler (Meat) Chickens',
                   'Cattle (Beef)', 'Horse Racing')) %>% 
  sf::st_as_sf(coords = c('lng', 'lat'), crs = 4326) # WGS 84
Warning: There was 1 warning in `filter()`.
ℹ In argument: `&...`.
Caused by warning in `type == c("Pigs", "Dairy", "Broiler (Meat) Chickens", "Cattle (Beef)",
    "Horse Racing")`:
! longer object length is not a multiple of shorter object length

Extract the minimum temperature for each farm.

points$tmin <- terra::extract(tmin_qld, points)

and then we can explore some diferences between the different farms and the temperature from where the farms are located.

6. Spatial Analysis

Perform linear regression model1

References

sf Package Documentation

Introduction to mapping in R

Plotting simple features

tmap: get started!